Dean Adams, Iowa State University
07 October, 2019
“The study of form may be descriptive merely, or it may become analytical. We begin by describing the shape of an object in simple words of common speech: we end by defining it in the precise language of mathematics; and the one method tends to follow the other in strict scientific order and historical continuity.”
To properly study shape we require a definition of it
For much of the history of biological inquiry, form and shape were synonymous
Plato & Aristotle: believed that understanding form led to understanding of function
Goethe: ‘morphology’, the form of an organism
Form and shape historically used interchangeably
Form = size + shape (current definition)
For centuries, biologists used morphology to compare organisms
Species taxonomy, classification, and ecological specialization were all addressed using anatomical properties
However, morphology was typically described qualitatively, leading to limitations
Much of the richness of morphological variation not captured
Limits comparisons to 2 objects (same or different)
For > 2 objects, only rank-orderings could be done (A > B > C)
Rank-ordering implies differences on a continuous scale
Thus, there was a need to quantitatively compare morphology
Use of quantitative data began in the late 1800s: variables were measured (e.g., wing length) and group means compared
Morphometric advances concurrent with statistical advances
LATE 1800s – EARLY 1900s
Galton: variation, correlation, regression
Pearson: more on correlation, \(\small\chi^2\)
From this, the Biometric tradition formed, holding that most biological variation was continuous
Early 1900s: BIG biological debate: Biometric vs. Mendelian (continuous vs. discontinuous biological variation)
Statistics developed (as a field) with evolutionary biology
ANOVA / Variance partitioning (Fisher: for quantitative genetics and more generally)
Discriminant Function Analysis (DFA/CVA), MANOVA
Generalized distance (Fisher, Mahalanobis, Rao)
Principal Components Analysis (Pearson, Hotelling)
Multivariate t-test (Hotelling)
Factor Analysis (Spearman)
Path Analysis (Wright)
Multivariate analysis of morphological measurements
Within- and between-group variation described and compared
Commonly used methods: principal components, canonical variates, discriminant function (see Blackith and Reyment, 1971)
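As an illustration of the first of these methods, the following is a minimal sketch of a principal components analysis on a matrix of linear measurements. The data are simulated and purely hypothetical (30 specimens, 3 measurements driven mostly by one shared factor), and the function name `pca` is introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 30 specimens x 3 linear measurements (e.g. lengths in mm).
# One shared factor drives most of the variation; the rest is small noise.
shared = rng.normal(0.0, 1.0, size=(30, 1))
loadings = np.array([[2.0, 1.5, 1.0]])
measurements = 10.0 + shared @ loadings + rng.normal(0.0, 0.3, size=(30, 3))

def pca(data):
    """Return PC scores, eigenvectors (as columns) and eigenvalues
    of the sample covariance matrix, sorted by decreasing variance."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs                 # project specimens onto the PCs
    return scores, eigvecs, eigvals

scores, axes, variances = pca(measurements)
explained = variances / variances.sum()         # proportion of variance per PC
print(np.round(explained, 3))
```

With raw linear measurements, the first component usually absorbs the shared (largely size-related) variation, which is one reason size correction becomes an issue in this framework.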
1980s-1990s: Theoretical advances and new data types caused a radical shift in morphometric methods
Geometric Morphometric Methods (GMM) emerged
Rigorous quantification of shape coupled with sound statistics and graphical visualization
Emphasis on landmarks due to more formal statistical theory
Procrustes Paradigm: rigorous quantification of shape from points, curves, and surfaces
Analysis of shape variation with graphical description of results
Morphometrics: the study of shape variation and its covariation
How to quantify shape? There are no natural units for shape
Data should:
Register shape in a repeatable manner and archive it for statistical analyses
Contain enough information to reconstruct a graphical representation of the structure of interest
Be appropriate to address the biological question of interest
Morphometric data may quantify:
The location of discrete anatomic features (points)
Distances between anatomic features (lines)
Outlines of a structure (curves)
The surface of a structure (surfaces)
Data must quantify shape in a repeatable manner
Homology: correspondence across specimens
Evolutionary homology: structures derived from the same tissue of the most recent common ancestor
Operational homology: correspondence in the position of structures
In morphometrics, we may use both, depending on the question being asked
The important point is to obtain repeatable data
Use linear measurements, angles, etc., between anatomic features, or structures (e.g. head width)
Use some method to eliminate non-shape variation (size)
Analyze the data using multivariate statistics
Also known as ‘traditional’ or ‘multivariate’ morphometrics
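The three steps above can be sketched in a few lines. This example uses one common size correction, dividing each measurement by the specimen's geometric mean (often called log-shape ratios). That choice is an assumption made for illustration, since several competing corrections exist; the measurement values and the function name `log_shape_ratios` are hypothetical.

```python
import numpy as np

# Hypothetical linear measurements (mm) for 4 specimens:
# columns are head width, head length, limb length.
measurements = np.array([
    [4.0,  8.0, 16.0],
    [8.0, 16.0, 32.0],   # specimen 0 scaled by 2: same proportions, larger size
    [4.0, 10.0, 14.0],
    [5.0,  8.0, 20.0],
])

def log_shape_ratios(data):
    """Size-correct by subtracting the log geometric mean of each specimen.

    Returns the size-corrected variables and the geometric-mean size."""
    log_data = np.log(data)
    log_size = log_data.mean(axis=1, keepdims=True)   # log of the geometric mean
    return log_data - log_size, np.exp(log_size).ravel()

shape_vars, size = log_shape_ratios(measurements)
print(np.round(shape_vars, 3))
print(np.round(size, 3))
```

Specimens 0 and 1 differ only in scale, so their size-corrected rows are identical while their size values differ by a factor of 2. The resulting `shape_vars` matrix would then go into a multivariate analysis such as PCA.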
Advantages
Allow comparison to previous studies
The variables used are very intuitive and easy to interpret biologically
Disadvantages
Size is a latent factor and there is no global consensus on how to account for it – different approaches provide different results
The same values may represent different shapes
Homology is difficult to evaluate and guarantee
Usually it is difficult to obtain a graphical representation of variation in shape (because the geometry of the structure is not preserved in the analysis)
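The point that the same values may represent different shapes can be shown with a small constructed example. The two four-point outlines below are hypothetical: both have the same maximum length and maximum width, but the widest point sits at a different position along the length, so the distances alone cannot tell them apart.

```python
import numpy as np

def four_point_outline(widest_at):
    """Hypothetical outline: front tip, upper widest point, back tip, lower widest point."""
    return np.array([
        [0.0, 0.0],
        [widest_at, 2.0],
        [10.0, 0.0],
        [widest_at, -2.0],
    ])

def length_and_width(landmarks):
    """Two traditional measurements: tip-to-tip length and maximum width."""
    length = np.linalg.norm(landmarks[2] - landmarks[0])
    width = np.linalg.norm(landmarks[1] - landmarks[3])
    return length, width

shape_a = four_point_outline(3.0)   # widest near the front
shape_b = four_point_outline(5.0)   # widest at mid-length

print(length_and_width(shape_a))
print(length_and_width(shape_b))
```

Both calls return the same length and width, even though the point configurations differ. What is lost is where the width was measured relative to the length.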
Something is missing! The relative positions of the distances on the structure
An interconnected network of distance measurements between anatomical points
An effort to maintain the relative positions between traits
Can provide a graphical representation
BUT statistical treatment is problematic
In the 1980s, there was a radical shift in methodology
Linear measurements do not capture all of the properties required to describe shape
The truss was an attempt to record relative positions
This information is inherent in the common endpoints of linear distances
Change from linear measurements to the locations of points themselves
Use landmark coordinates as the raw data
This advance was aided by the simultaneous development of mathematical shape theory
Use x, y (or x, y, z) coordinates of anatomic locations to quantify anatomical variation
Eliminate non-shape variation (size and others, see further on)
Multivariate data analysis
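The step of eliminating non-shape variation can be sketched with a Procrustes superimposition, which removes position, scale, and orientation from landmark configurations. This is a simplified illustration written for this text, assuming 2D landmarks and a fixed number of iterations; it is not the implementation of any particular package, and the function names and triangle coordinates are hypothetical.

```python
import numpy as np

def center_and_scale(config):
    """Remove position and size: center on the centroid, scale to unit centroid size."""
    centered = config - config.mean(axis=0)
    centroid_size = np.sqrt((centered ** 2).sum())
    return centered / centroid_size, centroid_size

def rotate_onto(config, reference):
    """Rotate config to best fit reference in the least-squares sense (no reflection)."""
    u, _, vt = np.linalg.svd(config.T @ reference)
    u[:, -1] *= np.sign(np.linalg.det(u @ vt))   # flip last axis if a reflection was found
    return config @ (u @ vt)

def generalized_procrustes(configs, n_iter=10):
    """Iteratively rotate all configurations onto their (rescaled) mean."""
    aligned = np.array([center_and_scale(c)[0] for c in configs])
    mean = aligned[0]
    for _ in range(n_iter):
        aligned = np.array([rotate_onto(c, mean) for c in aligned])
        mean = aligned.mean(axis=0)
        mean = mean / np.sqrt((mean ** 2).sum())  # keep the mean at unit centroid size
    return aligned, mean

# Hypothetical 3-landmark configurations.
base = np.array([[0.0, 0.0], [4.0, 0.0], [1.0, 3.0]])
theta = np.deg2rad(40.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
moved = 2.5 * base @ rotation + np.array([10.0, -3.0])    # same shape, different pose and size
different = np.array([[0.0, 0.0], [4.0, 0.0], [3.0, 2.0]])  # a genuinely different shape

aligned, mean_shape = generalized_procrustes([base, moved, different])
print(np.round(aligned, 3))
```

After superimposition, `base` and `moved` coincide because they differ only in position, size, and orientation, while `different` remains distinct. The aligned coordinates are the variables that would be passed to the multivariate analysis.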
Advantages
Easy to assess homology, so biological interpretation is more robust
The geometry of the studied structure is maintained in relative landmark locations
Can obtain a visual representation of shape variation
Disadvantages
Some structures have few, or no, landmarks
One may be interested in outlines, or other curves, to quantify shape
Advantages
Useful when there are few, or no, landmarks
They register a full representation of the outline (and therefore we can obtain a graphical representation of shape variation)
Disadvantages
Difficult to assess homology
Several analytical approaches exist, which provide different variables, and comparison among them is not straightforward (there is no globally accepted way to measure the shape of a curve)
Most methods do not allow the combination of curves and landmarks
Some curves may represent different shapes
Semilandmarks resolve the problems encountered when working with curves
Can be quantified on curves (2D) or surfaces (3D)
They are treated as “degenerate” landmarks, the position of which is limited to slide on a curve (or surface) during analysis
Only variation perpendicular to the curve is relevant
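The idea that only variation perpendicular to the curve is relevant can be sketched as follows. This is a deliberately simplified, single-step illustration: tangents are approximated from neighbouring points, the end points are treated as fixed landmarks, and each interior point is slid along its tangent so that its difference from a reference has no along-curve component. Actual sliding procedures are iterative and combined with superimposition; the function names and coordinates here are hypothetical.

```python
import numpy as np

def tangent_directions(curve):
    """Approximate unit tangents at interior points from their two neighbours."""
    t = curve[2:] - curve[:-2]
    return t / np.linalg.norm(t, axis=1, keepdims=True)

def remove_tangential_difference(curve, reference):
    """Slide interior points along their tangents so that the remaining
    difference from the reference is perpendicular to the tangent."""
    slid = curve.copy()
    tangents = tangent_directions(curve)
    diff = curve[1:-1] - reference[1:-1]
    along = (diff * tangents).sum(axis=1, keepdims=True)  # along-curve component
    slid[1:-1] = curve[1:-1] - along * tangents
    return slid

# Hypothetical example: the same parabola sampled at different positions.
x_ref = np.linspace(0.0, 4.0, 5)
reference = np.column_stack([x_ref, 0.25 * x_ref ** 2])
x_cur = np.array([0.0, 1.3, 2.2, 2.9, 4.0])
curve = np.column_stack([x_cur, 0.25 * x_cur ** 2])

slid = remove_tangential_difference(curve, reference)
print(np.round(slid, 3))
```

Here the two point sets lie on the same curve and differ mainly in where the points were placed along it. Sliding removes that arbitrary along-curve component, so what remains is the difference across the curve.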
Advantages
Good representation of curves, outlines and surfaces
They allow the examination of curves and surfaces in Procrustes shape space (which has known statistical properties)
It is possible to combine different morphological features (points, curves, surfaces) in a single analysis of form
By combining landmarks and semilandmarks, there is a general solution for quantifying biological shape variation
We can combine it all in a single analysis!
Sometimes landmarks are missing
Methods to estimate their location exist (discussed later this week)
Sometimes objects display symmetry
GMM procedures adjusted slightly for this (discussed later this week)
Methods are a product of theory and data type
Choose methods that are mathematically sound, but also theoretically and biologically sound
METHODS SHOULD NOT INTRODUCE PATTERNS!!!!